NSF PAR Search | NSF Public Access Repository

A Study on the Calibration of In-context Learning

Zhang, Hanlin; Zhang, YiFan; Yu, Yaodong; Madeka, Dhruv; Foster, Dean; Xing, Eric; Lakkaraju, Himabindu; Kakade, Sham (June 2024, Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers))
Duh, Kevin; Gomez, Helena; Bethard, Steven (Ed.)
Full Text Available
DataComp-LM: In search of the next generation of training sets for language models

Li, Jeffrey; Fang, Alex; Smyrnis, Georgios; Ivgi, Maor; Jordan, Matt; Gadre, Samir; Bansal, Hritik; Guha, Etash; Keh, Sedrick; Arora, Kushal; et al (April 2025, https://doi.org/10.48550/arXiv.2406.11794)

The authors introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments aimed at improving language models. DCLM provides a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants can experiment with dataset curation strategies such as deduplication, filtering, and data mixing at model scales ranging from 412M to 7B parameters. As a baseline, the authors find that model-based filtering is critical for assembling a high-quality training set. Their resulting dataset, DCLM-Baseline, enables training a 7B parameter model from scratch to achieve 64% 5-shot accuracy on MMLU with 2.6T training tokens. This represents a 6.6 percentage point improvement over MAP-Neo (the previous state-of-the-art in open-data LMs), while using 40% less compute. The baseline model is also comparable to Mistral-7B-v0.3 and Llama 3 8B on MMLU (63% and 66%), and performs similarly on an average of 53 NLU tasks, while using 6.6x less compute than Llama 3 8B. These findings emphasize the importance of dataset design for training LMs and establish a foundation for further research on data curation.
Free, publicly-accessible full text available April 21, 2026
Toward Learning Human-aligned Cross-domain Robust Models by Countering Misaligned Features

Wang, Haohan; Huang, Zeyi; Zhang, Hanlin; Lee, Yong Jae; Xing, Eric P. (January 2022, Conference on Uncertainty in Artificial Intelligence (UAI))

Machine learning has demonstrated remarkable prediction accuracy on i.i.d. data, but accuracy often drops when models are tested on data from another distribution. In this paper, we offer another view of this problem, assuming that the accuracy drop stems from models relying on features that are not well aligned with what a data annotator considers similar across the two datasets. We refer to these features as misaligned features. We extend the conventional generalization error bound to a new one for this setup, given knowledge of how the misaligned features are associated with the label. Our analysis offers a set of techniques for this problem, and these techniques are naturally linked to many previous methods in the robust machine learning literature. We also compare the empirical strength of these methods and demonstrate the performance when these previous techniques are combined, with implementation available here
Full Text Available
Guidelines for the use and interpretation of assays for monitoring autophagy (4th edition)

https://doi.org/10.1080/15548627.2020.1797280

Klionsky, Daniel J.; Abdel-Aziz, Amal Kamal; Abdelfatah, Sara; Abdellatif, Mahmoud; Abdoli, Asghar; Abel, Steffen; Abeliovich, Hagai; Abildgaard, Marie H.; Abudu, Yakubu Princely; Acevedo-Arozena, Abraham; et al (January 2021, Autophagy)

Full Text Available
